The Annals of Statistics 

2009, Vol. 37, No. 3, 1150-1171 

DOI: 10.1214/08-AOS601 

© Institute of Mathematical Statistics, 2009 

ASYMPTOTICS FOR SPHERICAL NEEDLETS 

By P. Baldi, G. Kerkyacharian, D. Marinucci and D. Picard

Università di Roma Tor Vergata, LPMA Université de Paris X, Università
di Roma Tor Vergata and LPMA Université de Paris 7

We investigate invariant random fields on the sphere using a new 
type of spherical wavelets, called needlets. These are compactly sup- 
ported in frequency and enjoy excellent localization properties in real 
space, with quasi-exponentially decaying tails. We show that, for ran- 
dom fields on the sphere, the needlet coefficients are asymptotically 
uncorrelated for any fixed angular distance. This property is used to 
derive CLT and functional CLT convergence results for polynomial 
functionals of the needlet coefficients: here the asymptotic theory is 
considered in the high-frequency sense. Our proposals emerge from 
strong empirical motivations, especially in connection with the anal- 
ysis of cosmological data sets. 



1. Introduction. Over the last two decades, wavelets have emerged as 
one of the most interesting tools of statistical investigation. In this paper 
we give an application to the statistical analysis of data sets indexed by
the unit sphere $\mathbb{S}^2$. This is motivated mostly by the analysis of the Cosmic
Microwave Background radiation (hereafter CMB), currently a very active
field of research in astrophysics. Every year hundreds of papers appear in
physics journals about CMB, and interest in this topic is going to grow
in the next few years when the ESA satellite PLANCK will provide a fresh
flow of high-resolution data. Examples of spherical data appear also in other
areas of the astrophysical sciences [see Angers and Kim (2005)] or outside
astrophysics, for example in brain shape modeling and image analysis [see, e.g.,
Mardia and Patrangenaru (2005), Dryden (2005) and Dette, Melas and Pe- 
pelyshev (2005)]. 

CMB data pose a large number of challenging statistical problems, for
instance, estimation of the correlation structure and of the parameters gov- 
erning this correlation, testing on the law of the field itself (which is predicted 
to be Gaussian, or very close to Gaussian, by leading physical models of the 
Big Bang dynamics), detection of outliers in the observed data (which may 
signal observations of noncosmological origin, i.e., so-called point sources), 
testing for isotropy and many others [see Genovese et al. (2004a, 2004b), 
Marinucci (2004) and Marinucci (2006)]. 

Random fields on the sphere can be investigated using Fourier develop- 
ments in spherical harmonics. These methods are, however, difficult to adapt 
when the data are known only on a portion of the spherical surface. This is 
actually the case of CMB data, as the observation of this field is missing in 
the equatorial region, due to the direct radiation from the Milky Way. 

In this paper we investigate the statistical properties of the so-called 
needlets. These are a family of spherical wavelets which were introduced 
by Narcowich, Petrushev and Ward (2006). Needlets enjoy several proper- 
ties which are not shared by other spherical wavelets. First they enjoy good 
localization properties in frequency: needlets are compactly supported in 
the frequency domain with a bounded support which depends explicitly on 
a user-chosen parameter. On the other hand, needlets enjoy excellent localization
properties in real space, with quasi-exponential decay of the tails (see
Figure 2 for a typical graph). See Antoine and Vandergheynst (1999) and 
Antoine et al. (2002) for a different approach to spherical wavelets. 

As a major consequence of the localization property both in the frequency
and in the space domain, the needlet coefficients are asymptotically uncorrelated
as the frequency tends to infinity, for any fixed angular distance. This is the
first result of this kind for any type of spherical wavelets [see
Baldi et al. (2008) for a similar result on the torus]. We use this key property
to derive a central limit theorem and a functional central limit theorem for
general nonlinear statistics of the wavelet coefficients. We discuss how from
these results one can derive, for instance, procedures for testing goodness-of-fit
on the angular power spectra.

Let us stress again the great advantage of needlets: their ability (due to
their localization properties) to deal with data known only on portions of the
spherical surface. We remark also that the needlet construction does not rely
on any sort of tangent plane approximation which is typically undertaken 
to implement wavelets on the sphere. 

The plan of the paper is as follows. In Section 2 we describe the construc- 
tion of needlets, following the approach of Narcowich, Petrushev and Ward 
(2006). In Section 3 we use them to investigate random fields on the sphere 
and derive the basic correlation inequality. In Section 4 we recall some clas- 
sical results on the diagram formula, that are needed in Sections 5 and 6 to 
derive the main convergence results. In Sections 7 and 8 we discuss statistical 
applications and the effect of missing observations. 






2. Construction of needlets. This construction is due to Narcowich, Petru- 
shev and Ward (2006). Its aim is essentially to build a very well-localized 
tight frame constructed using spherical harmonics, as discussed below. It was 
recently extended to more general Euclidean settings with fruitful statistical 
applications [see Kerkyacharian et al. (2007)]. 

Let us denote by $\mathbb{S}^2$ the unit sphere of $\mathbb{R}^3$. There is a unique positive
measure on $\mathbb{S}^2$ which is invariant by rotation, with total mass $4\pi$. This
measure will be denoted by $dx$. The following decomposition is well known:

(1) $L^2 = \bigoplus_{l=0}^{\infty} \mathcal{H}_l$,

where $L^2$ denotes the space of square integrable functions on the sphere
with respect to $dx$, and $\mathcal{H}_l$ denotes the vector space of the restrictions to
$\mathbb{S}^2$ of homogeneous polynomials on $\mathbb{R}^3$, of degree $l$, which are harmonic (i.e.,
$\Delta P = 0$, where $\Delta$ is the Laplacian on $\mathbb{R}^3$). $\mathcal{H}_l$ is called the space of spherical
harmonics of degree $l$ [see Stein and Weiss (1971), Chapter 4; Varshalovich,
Moskalev and Khersonskii (1988), Chapter 5] and has dimension $2l + 1$. The
orthogonal projector on $\mathcal{H}_l$ is given by the kernel operator

(2) $\forall f \in L^2, \quad P_{\mathcal{H}_l} f(x) = \int_{\mathbb{S}^2} L_l(\langle x,y\rangle) f(y)\,dy$,

where $\langle x,y\rangle$ is the standard scalar product of $\mathbb{R}^3$, and $L_l$ is the Legendre
polynomial of degree $l$ defined on $[-1,+1]$, verifying

$\int_{-1}^{1} L_l(u) L_k(u)\,du = \frac{2k+1}{8\pi^2}\,\delta_{l,k}$,

where $\delta_{l,k}$ is the Kronecker symbol. Moreover, by definition of the projection
operator,

$L_l(\langle x,y\rangle) = \sum_{m=-l}^{l} Y_{lm}(x)\overline{Y_{lm}(y)}$,

where the spherical harmonics $Y_{lm}$, $l = 1,2,3,\dots$, $m = -l,\dots,l$, form an
orthonormal basis of $\mathcal{H}_l$. For an explicit expression of the functions $Y_{lm}$, see
Varshalovich, Moskalev and Khersonskii (1988), Chapter 5. Let us point
out the reproducing property of the projection operators

(3) $\int_{\mathbb{S}^2} L_l(\langle x,y\rangle) L_k(\langle y,z\rangle)\,dy = \delta_{l,k} L_l(\langle x,z\rangle)$.

The needlet construction is based on two fundamental steps: Littlewood- 
Paley decomposition and discretization, which are summarized in the two 
following subsections. 
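
As a quick numerical sanity check of the normalization of the kernels $L_l$, the sketch below assumes that $L_l(u) = \frac{2l+1}{4\pi} P_l(u)$, with $P_l$ the classical Legendre polynomial (this is what the addition theorem gives for the projector onto the spherical harmonics of degree $l$). The function names `kernel_L` and `inner_product` are ours and purely illustrative.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def kernel_L(l, u):
    """Assumed projection kernel L_l(u) = (2l+1)/(4*pi) * P_l(u)."""
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0                      # select the single Legendre term P_l
    return (2 * l + 1) / (4 * np.pi) * leg.legval(u, coeffs)


def inner_product(l, k, n_nodes=64):
    """Integral of L_l * L_k over [-1, 1] by Gauss-Legendre quadrature.

    The rule is exact for polynomial integrands of degree < 2 * n_nodes.
    """
    nodes, weights = leg.leggauss(n_nodes)
    return float(np.sum(weights * kernel_L(l, nodes) * kernel_L(k, nodes)))


if __name__ == "__main__":
    for l, k in [(3, 3), (3, 5), (10, 10)]:
        target = (2 * l + 1) / (8 * np.pi ** 2) if l == k else 0.0
        print(l, k, inner_product(l, k), target)
```

Under this assumption the diagonal values agree with $(2l+1)/(8\pi^2)$ and the off-diagonal values vanish up to rounding.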

Fig. 1. Typical graph of $\phi$ (dots) and $b^2$ (solid). Here $B = 2$.

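2.1. Littlewood-Paley decomposition. Let $\phi$ be a $C^\infty$ function on $\mathbb{R}$, symmetric
and decreasing on $\mathbb{R}^+$, supported in $|\xi| \le 1$, such that $1 \ge \phi(\xi) \ge 0$
and $\phi(\xi) = 1$ if $|\xi| \le 1/B$. Let us define for an arbitrarily chosen $B > 1$ (see
Figure 1):

$b^2(\xi) = \phi(\xi/B) - \phi(\xi) \ge 0$,

so that

(4) $\forall |\xi| \ge 1, \quad \sum_{j \ge 0} b^2(\xi/B^j) = 1$.

Remark that $b(\xi) \ne 0$ only if $1/B \le |\xi| \le B$. Let us now define the operator
$\Lambda_j = \sum_{l \ge 0} b^2(l/B^j) L_l$ and the associated kernel

$\Lambda_j(x,y) = \sum_{l \ge 0} b^2(l/B^j) L_l(\langle x,y\rangle)$.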
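
The partition of unity (4) is a telescoping sum. As a short worked check, assuming that the window is defined by $b^2(\xi) = \phi(\xi/B) - \phi(\xi)$ (our reading of the definition of $b$), and using only that $\phi = 1$ on $[0, 1/B]$ and $\phi = 0$ on $[1, \infty)$:

```latex
\sum_{j=0}^{J} b^2\!\left(\frac{\xi}{B^{j}}\right)
  = \sum_{j=0}^{J}\left[\phi\!\left(\frac{\xi}{B^{j+1}}\right)
                       - \phi\!\left(\frac{\xi}{B^{j}}\right)\right]
  = \phi\!\left(\frac{\xi}{B^{J+1}}\right) - \phi(\xi)
  = 1 - 0 = 1
  \qquad \text{whenever } 1 \le |\xi| \le B^{J}.
```

The same computation shows that $b^2(\xi/B^j)$ vanishes unless $B^{j-1} < |\xi| < B^{j+1}$, so that for each $\xi$ at most two consecutive values of $j$ contribute to the sum.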




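The following proposition is obvious.

Proposition 1. For every $f \in L^2$, $f = \lim_{J\to\infty} L_0(f) + \sum_{j=0}^{J} \Lambda_j(f)$,
where $L_l(f)(x) = \int_{\mathbb{S}^2} L_l(\langle x,y\rangle) f(y)\,dy$ and $\Lambda_j(f)(x) = \int_{\mathbb{S}^2} \Lambda_j(x,y) f(y)\,dy$. Moreover,
if $M_j(x,y) = \sum_{l \ge 0} b(l/B^j) L_l(\langle x,y\rangle)$, then

(5) $\Lambda_j(x,y) = \int_{\mathbb{S}^2} M_j(x,z) M_j(z,y)\,dz$.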

2.2. Discretization and localization properties. Let

$\mathcal{K}_l = \bigoplus_{m=0}^{l} \mathcal{H}_m$

be the space of the restrictions to $\mathbb{S}^2$ of the polynomials of degree at most $l$.
The following quadrature formula is true: for all $l \in \mathbb{N}$ there exists a finite
subset $\mathcal{X}_l \subset \mathbb{S}^2$ and positive real numbers $\lambda_\eta > 0$, indexed by $\eta \in \mathcal{X}_l$, such
that

(6) $\forall f \in \mathcal{K}_l, \quad \int_{\mathbb{S}^2} f(x)\,dx = \sum_{\eta \in \mathcal{X}_l} \lambda_\eta f(\eta)$.
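
To make the quadrature formula (6) concrete, here is a minimal sketch of one classical rule with positive weights that is exact on polynomials of degree at most $l$: Gauss-Legendre nodes in $\cos\theta$ combined with equispaced longitudes. This product rule is only an illustration chosen by us; it is not the construction of Narcowich, Petrushev and Ward (2006), and the name `product_gauss_cubature` is hypothetical.

```python
import numpy as np


def product_gauss_cubature(degree):
    """Points and positive weights on the unit sphere, exact for
    polynomials of total degree <= degree (product Gauss rule)."""
    n_theta = degree // 2 + 1     # Gauss-Legendre in u = cos(theta): exact up to degree 2*n_theta - 1
    n_phi = degree + 1            # equispaced longitudes: exact for Fourier modes |m| <= degree
    u, w = np.polynomial.legendre.leggauss(n_theta)
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    uu, pp = np.meshgrid(u, phi, indexing="ij")
    s = np.sqrt(1.0 - uu ** 2)
    points = np.stack([s * np.cos(pp), s * np.sin(pp), uu], axis=-1).reshape(-1, 3)
    # each latitude weight is shared equally by the n_phi longitudes on that ring
    weights = np.repeat(w * (2.0 * np.pi / n_phi), n_phi)
    return points, weights


if __name__ == "__main__":
    pts, lam = product_gauss_cubature(4)
    x, y, z = pts.T
    print("total mass:", lam.sum(), "vs 4*pi =", 4 * np.pi)
    print("integral of z^4:", np.sum(lam * z ** 4), "vs 4*pi/5 =", 4 * np.pi / 5)
```

The weights play the role of the $\lambda_\eta$ in (6); the number of points grows like the square of the degree, in line with the cardinality of the cubature sets used below.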

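The operator $M_j$ defined in Proposition 1 is such that

$z \mapsto M_j(x,z) \in \mathcal{K}_{[B^{j+1}]}, \quad x \mapsto M_j(x,z) \in \mathcal{K}_{[B^{j+1}]}$,

so that

$z \mapsto M_j(x,z) M_j(z,y) \in \mathcal{K}_{[2B^{j+1}]}$

and, by the quadrature formula (6),

$\Lambda_j(x,y) = \int_{\mathbb{S}^2} M_j(x,z) M_j(z,y)\,dz = \sum_{\eta \in \mathcal{X}_{[2B^{j+1}]}} \lambda_\eta M_j(x,\eta) M_j(\eta,y)$.

This implies

$\Lambda_j f(x) = \int_{\mathbb{S}^2} \Lambda_j(x,y) f(y)\,dy = \int_{\mathbb{S}^2} \sum_{\eta \in \mathcal{X}_{[2B^{j+1}]}} \lambda_\eta M_j(x,\eta) M_j(\eta,y) f(y)\,dy$

$\qquad = \sum_{\eta \in \mathcal{X}_{[2B^{j+1}]}} \sqrt{\lambda_\eta}\, M_j(x,\eta) \int_{\mathbb{S}^2} \sqrt{\lambda_\eta}\, M_j(y,\eta) f(y)\,dy$.

We denote

$\mathcal{X}_{[2B^{j+1}]} = \mathcal{Z}_j, \quad \psi_{j,\eta}(x) := \sqrt{\lambda_\eta}\, M_j(x,\eta)$ for $\eta \in \mathcal{Z}_j$,

and have

$\frac{1}{c} B^{2j} \le \#\mathcal{Z}_j \le c B^{2j}$

for some $c > 0$. We note $N_j = \sqrt{\#\mathcal{Z}_j}$. It holds, by Proposition 1,

$f = L_0(f) + \sum_{j} \sum_{\eta \in \mathcal{Z}_j} \langle f, \psi_{j,\eta}\rangle_{L^2}\, \psi_{j,\eta}$.

The main result of Narcowich, Petrushev and Ward (2006) is the following
localization property of the $\psi_{j,\eta}$, which are called "needlets": for any $k$ there
exists a constant $c_k$ such that, for every $\xi \in \mathbb{S}^2$,

(7) $|\psi_{j,\eta}(\xi)| \le \frac{c_k B^j}{(1 + B^j d(\eta,\xi))^k}$,

where $d(\xi,\eta) = \arccos(\langle \eta,\xi\rangle)$ is the natural geodesic distance on the sphere. In
other words, needlets are almost exponentially localized around any cubature
point, which motivates their name (see Figure 2 in Section 8). Finally, notice
that the construction in Narcowich, Petrushev and Ward (2006) is made
with $B = 2$. We introduce here the free parameter $B > 1$, because in physical
applications it may be useful in fine tuning the concentration in frequency.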

3. Isotropic fields on the sphere and the needlet expansion. In this section
we define the needlet expansion of an isotropic random field $T$ on $\mathbb{S}^2$.
We say that $T$ is invariant by rotation (or isotropic) if

$\forall \rho \in SO(3), \quad E[T(\rho x)T(\rho y)] = E[T(x)T(y)]$.

This is equivalent to the fact that the covariance function of the process is
of the form

$E[T(x)T(y)] = K(\langle x,y\rangle)$,

where $K$ is a bounded function defined on $[-1,+1]$.

Throughout this paper we make the following assumption.

Assumption 1. $T$ is a centered Gaussian field that is mean square
continuous and isotropic.

Let us decompose $K$ on the basis of Legendre polynomials:

$K = \sum_{l} C_l L_l, \quad C_l = \frac{8\pi^2}{2l+1} \int_{-1}^{1} K(u) L_l(u)\,du$.

We write

$T(x) = \sum_{l \ge 0} T_l(x)$,

where

$T_l(x) = \int_{\mathbb{S}^2} T(y) L_l(\langle x,y\rangle)\,dy$

($T_0 = 0$ as the field is assumed to be centered). It is immediate that

$E[T_l(x)T_k(y)] = \delta_{kl}\, C_l L_l(\langle x,y\rangle)$ for every $x, y \in \mathbb{S}^2$.

Actually all vectors in $\mathcal{H}_l$ are eigenvectors of the Karhunen-Loève expansion
of $K(\langle\cdot,\cdot\rangle)$. The previous projection can be realized explicitly as

(8) $T_l(x) = \sum_{m=-l}^{l} a_{lm} Y_{lm}(x)$,

where

(9) $a_{lm} = \int_{\mathbb{S}^2} T(x)\overline{Y_{lm}(x)}\,dx$.

$(a_{lm})_{l,m}$ is a triangular array of complex uncorrelated [but for the condition
$(-1)^m a_{lm} = \overline{a_{l,-m}}$] r.v.'s and $C_l$ is equal to the variance of $a_{lm}$.

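For every integer $j$, let $\mathcal{Z}_j$ be the set of cubature points defined in the
previous section. The points $\eta$ belonging to $\mathcal{Z}_j$ will be denoted $\xi_{j,k}$, $k =
1, \dots, N_j^2$. Similarly we denote $\psi_{j,\eta}$ by $\psi_{j,k}$, $\lambda_\eta$ by $\lambda_{j,k}$, and the needlet coefficient of a
function $f$, $\langle f, \psi_{j,k}\rangle_{L^2}$, by $\beta_{j,k}$. Hence the random needlet coefficients are

$\beta_{j,k} = \int_{\mathbb{S}^2} T(x)\psi_{j,k}(x)\,dx$.

Actually

$\int_{\mathbb{S}^2} T(x)\psi_{j,k}(x)\,dx = \int_{\mathbb{S}^2} \sum_{l} T_l(x)\psi_{j,k}(x)\,dx$

$\qquad = \sqrt{\lambda_{j,k}} \sum_{l}\sum_{l'} b(l'/B^j) \int_{\mathbb{S}^2} T_l(x) L_{l'}(\langle x,\xi_{j,k}\rangle)\,dx$

$\qquad = \sqrt{\lambda_{j,k}} \sum_{l} b(l/B^j) \int_{\mathbb{S}^2} T_l(x) L_l(\langle x,\xi_{j,k}\rangle)\,dx$

$\qquad = \sqrt{\lambda_{j,k}} \sum_{l} b(l/B^j)\, T_l(\xi_{j,k})$,

in view of the reproducing properties of the projection kernel. Hence

$E[\beta_{j,k}\beta_{j,k'}] = \sqrt{\lambda_{j,k}\lambda_{j,k'}} \sum_{l} b^2(l/B^j)\, C_l L_l(\langle \xi_{j,k},\xi_{j,k'}\rangle)$

and

$\mathrm{Corr}(\beta_{j,k},\beta_{j,k'}) = \frac{\sqrt{\lambda_{j,k}\lambda_{j,k'}} \sum_{l} b^2(l/B^j)\, C_l L_l(\langle \xi_{j,k},\xi_{j,k'}\rangle)}{\sqrt{\lambda_{j,k}\lambda_{j,k'}} \sum_{l} b^2(l/B^j)\, C_l L_l(1)}
= \frac{\sum_{l} b^2(l/B^j)\, C_l L_l(\langle \xi_{j,k},\xi_{j,k'}\rangle)}{\sum_{l} b^2(l/B^j)\, C_l L_l(1)}$.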

We shall need to assume some regularity conditions on the asymptotic
behavior of the angular power spectrum $C_l$.

Assumption 2. There exist $M > 0$, $\alpha > 2$ and a sequence of functions
$(g_j)_j$ such that

$C_l = l^{-\alpha} g_j(l/B^j)$

for every $l$ such that $B^{j-1} < l < B^{j+1}$, and positive numbers $c_1, c_2, k_r$, $r =
0, \dots, M$, such that

$c_1 l^{-\alpha} \le C_l \le c_2 l^{-\alpha}, \quad \sup_j \sup_{B^{-1} \le u \le B} |g_j^{(r)}(u)| \le k_r$,

where $g_j^{(r)}(u) := \frac{d^r}{du^r} g_j(u)$ (a uniform bounded differentiability condition).

Remark 2. Assumption 2 is a regularity condition on the asymptotics
of the angular power spectrum which is trivially satisfied, for instance, if
$C_l = l^{-\alpha}$. Note that the sequence $(g_j)_j$ belongs uniformly to the Sobolev
space $W^{M,\infty}([B^{-1}, B])$.

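The following result is the basic localization inequality which plays a
crucial role for the arguments below.

Lemma 3. Under Assumptions 1 and 2, there exists a constant $C_M > 0$ such that

(10) $|\mathrm{Corr}(\beta_{j,k},\beta_{j,k'})| \le \frac{C_M}{(1 + B^j d(\xi_{j,k},\xi_{j,k'}))^M}$,

where, as hinted above, $d(\xi_{j,k},\xi_{j,k'}) = \arccos(\langle \xi_{j,k},\xi_{j,k'}\rangle)$.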



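Proof. Observe first that, as we assumed that $c_1 l^{-\alpha} \le C_l \le c_2 l^{-\alpha}$,

(11) $c_1 B^{(2-\alpha)j} \le \sum_{l} b^2(l/B^j)\, C_l L_l(1) \le c_2 B^{(2-\alpha)j}$.

We recall the following bound for type II polynomials which is derived in
Narcowich, Petrushev and Ward (2006), Theorem 2.6:

$\Big|\sum_{l} \phi_j(l/B^j) L_l(\langle x,y\rangle)\Big| \le \frac{c_M B^{2j}}{(1 + B^j d(x,y))^M}$,

where $c_M$ only depends on $\sup_{j \ge 1,\, r \le M} \|\phi_j^{(r)}\|$. Whence, using this for $\phi_j(x) =
b^2(x) x^{-\alpha} g_j(x)$ and (11),

$\frac{\big|\sum_{l} b^2(l/B^j)\, C_l L_l(\langle \xi_{j,k},\xi_{j,k'}\rangle)\big|}{\sum_{l} b^2(l/B^j)\, C_l L_l(1)}
= \frac{\big|\sum_{l} b^2(l/B^j)\, l^{-\alpha} g_j(l/B^j) L_l(\langle \xi_{j,k},\xi_{j,k'}\rangle)\big|}{\sum_{l} b^2(l/B^j)\, C_l L_l(1)}$

$\qquad = \frac{\big|\sum_{l} b^2(l/B^j)\, (l/B^j)^{-\alpha} g_j(l/B^j) L_l(\langle \xi_{j,k},\xi_{j,k'}\rangle)\big|}{B^{\alpha j} \sum_{l} b^2(l/B^j)\, C_l L_l(1)}
\le \frac{C_M}{(1 + B^j d(\xi_{j,k},\xi_{j,k'}))^M}$. □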

Remark 4. As mentioned in the Introduction, the previous lemma highlights
a peculiar feature: the needlet coefficients at any finite distance are
asymptotically uncorrelated. This property is at the heart of our results 
below. 
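
The decay of the correlation can be explored numerically. The sketch below is only illustrative and rests on several assumptions of ours: the power spectrum is taken as $C_l = l^{-\alpha}$ (the trivial case of Assumption 2), $L_l(u)$ is taken proportional to $(2l+1)P_l(u)$, and $\phi$ is built from the bump function $\exp(-1/(1-t^2))$ along the lines of Section 8, with an affine rescaling that we chose. The names `make_phi` and `needlet_correlation` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def make_phi(B, n_grid=20001):
    """Smooth phi: equal to 1 on [0, 1/B], decreasing to 0 at 1 (tabulated numerically)."""
    t = np.linspace(-1.0, 1.0, n_grid)
    f = np.zeros_like(t)
    inside = np.abs(t) < 1.0
    f[inside] = np.exp(-1.0 / (1.0 - t[inside] ** 2))
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(t))))
    psi = cum / cum[-1]                       # psi(-1) = 0, psi(1) = 1

    def phi(x):
        x = np.abs(np.asarray(x, dtype=float))
        s = 1.0 - 2.0 * B / (B - 1.0) * (x - 1.0 / B)   # x = 1/B -> +1, x = 1 -> -1
        return np.interp(np.clip(s, -1.0, 1.0), t, psi)

    return phi


def needlet_correlation(j, theta, B=2.0, alpha=3.0, phi=None):
    """Correlation of two needlet coefficients at scale j and angular distance theta,
    computed as sum_l b^2 C_l L_l(cos theta) / sum_l b^2 C_l L_l(1) with C_l = l^(-alpha)."""
    phi = make_phi(B) if phi is None else phi
    l_max = int(np.floor(B ** (j + 1)))
    l = np.arange(1, l_max + 1, dtype=float)
    b2 = phi(l / B ** (j + 1)) - phi(l / B ** j)       # b^2(l / B^j)
    weights = b2 * l ** (-alpha) * (2.0 * l + 1.0)     # proportional to b^2 C_l L_l(1)
    coeffs = np.concatenate(([0.0], weights))          # Legendre series coefficients, degree 0 first
    return float(leg.legval(np.cos(theta), coeffs) / weights.sum())


if __name__ == "__main__":
    phi = make_phi(2.0)
    for j in range(3, 8):
        print(j, needlet_correlation(j, 0.3, phi=phi))
```

Printing the values for increasing $j$ at a fixed angle gives an empirical view of the decay predicted by Lemma 3; no particular numerical values are claimed here.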



4. The central limit theorem for polynomial functionals of needlet coef- 
ficients. In the sequel, we make an extensive use of diagrams, which are 
mnemonic devices for computation of moments and cumulants of polynomi- 
als of Gaussian random variables. We adopt standard terms and notation 
(edges, nodes, connected components, . . .), and we refer to Surgailis (2003), 
Marinucci (2006) and Baldi et al. (2008), Section 5, for definitions and back- 
ground results. See also Nualart and Peccati (2005) for a more recent point 
of view. 

In the arguments to follow, we focus on polynomial functionals of the
(normalized) wavelet coefficients, of the form

(12) $h_{u,N_j} := \frac{1}{N_j} \sum_{k=1}^{N_j^2} \sum_{q=1}^{Q} w_{uq} H_q(\hat\beta_{j,k}), \quad \hat\beta_{j,k} := \frac{\beta_{j,k}}{\sqrt{E\beta_{j,k}^2}}$,

where $u = 1, 2, \dots, U$. Recall that $N_j = \sqrt{\#\mathcal{Z}_j}$. Here $w_{uq}$ are real scalars
and $H_q$ denotes the $q$th Hermite polynomial. As Hermite polynomials are
an algebraic basis, every polynomial in the variables $\hat\beta_{j,k}$ is of the form (12);
we start from a general characterization of the behavior of the sequences
$h_{u,N_j}$. First we define the covariance matrix $\Omega_j$, with elements

(13) $\Omega_j := \{E[h_{u,N_j} h_{v,N_j}]\}_{u,v=1,\dots,U}$.

Throughout the sequel we assume the following regularity condition:



Assumption 3. There exists $j_0$ such that for $j > j_0$ the covariance matrix
$\Omega_j$ is invertible.

Assumption 3 is a nondegeneracy condition on the asymptotics of the
statistics of interest. Consider for instance the scalar case $U = 1$. From the
diagram formula, it is immediate to obtain

$E[h_{1,N_j}^2] = \frac{1}{N_j^2} \sum_{q=1}^{Q} w_{1q}^2\, \mathrm{Var}\Big(\sum_{k} H_q(\hat\beta_{j,k})\Big)
= \frac{1}{N_j^2} \sum_{q=1}^{Q} w_{1q}^2\, q! \sum_{k,k'=1}^{N_j^2} \left(\frac{E[\beta_{j,k}\beta_{j,k'}]}{\sqrt{E[\beta_{j,k}^2]\,E[\beta_{j,k'}^2]}}\right)^{q}$.

The previous condition merely states that our nonlinear statistics have
a nondegenerate asymptotic variance. Ruling aside pathological cases, it
should be noted that the previous assumption basically requires $w_{1p}^2 > 0$ for
some $p$. In the multivariate case $U > 1$ we also require that the polynomials
$h_{u,N_j}$ and $h_{v,N_j}$ are linearly independent. It is to be noted, however, that
the assumption fails for a polynomial of order 1.
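
The variance computation above rests on the classical identity $E[H_q(X)H_q(Y)] = q!\,\rho^q$ for standard Gaussian variables $X, Y$ with correlation $\rho$ (the simplest instance of the diagram formula). A minimal numerical check, assuming the probabilists' normalization of the Hermite polynomials ($H_2(x) = x^2 - 1$, and so on), is sketched below; the helper name `hermite_cov` is ours.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as herme


def hermite_cov(q, rho, n_nodes=40):
    """E[H_q(X) H_q(Y)] for standard Gaussians with correlation rho,
    computed by a tensor Gauss-Hermite rule (exact for polynomial integrands)."""
    x, w = herme.hermegauss(n_nodes)        # nodes/weights for the weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)            # normalise to the N(0, 1) law
    c = np.zeros(q + 1)
    c[q] = 1.0                              # coefficient vector selecting H_q
    X, Z = np.meshgrid(x, x, indexing="ij")
    Y = rho * X + np.sqrt(1.0 - rho ** 2) * Z   # Y is N(0, 1) with Corr(X, Y) = rho
    W = np.outer(w, w)
    return float(np.sum(W * herme.hermeval(X, c) * herme.hermeval(Y, c)))


if __name__ == "__main__":
    for q, rho in [(2, -0.3), (3, 0.5), (4, 0.8)]:
        print(q, rho, hermite_cov(q, rho), math.factorial(q) * rho ** q)
```

In the formula for $E[h_{1,N_j}^2]$, $\rho$ is the correlation between $\hat\beta_{j,k}$ and $\hat\beta_{j,k'}$, which Lemma 3 controls.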

Theorem 5. Under Assumptions 1, 2 and 3, as $N_j \to \infty$,

$\Omega_j^{-1/2}(h_{1,N_j}, \dots, h_{U,N_j})' \to_d N(0, I_U)$,

where $I_U$ denotes the identity matrix of dimension $U$.

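Proof. We note first that the multivariate result follows immediately
from the case $U = 1$, as by the Cramér-Wold device it is enough to focus on
sequences of the form

$\frac{\sum_{u=1}^{U} \lambda_u h_{u,N_j}}{\sqrt{\mathrm{Var}\big(\sum_{u=1}^{U} \lambda_u h_{u,N_j}\big)}}$.

However, it is clear that, for any choice of real numbers $\lambda_1, \dots, \lambda_U$,

$\sum_{u=1}^{U} \lambda_u h_{u,N_j} = \frac{1}{N_j} \sum_{k=1}^{N_j^2} \sum_{q=1}^{Q} \sum_{u=1}^{U} \lambda_u w_{uq} H_q(\hat\beta_{j,k}) = \frac{1}{N_j} \sum_{k=1}^{N_j^2} \sum_{q=1}^{Q} w_q H_q(\hat\beta_{j,k}) =: h_{N_j}$,

where $w_q := \sum_{u=1}^{U} \lambda_u w_{uq}$. It is obvious that $E[h_{N_j}] = 0$. Hence to complete
the argument it is sufficient to prove that, as $N_j \to \infty$,

$\lim E\left[\left(\frac{h_{N_j}}{\sqrt{\mathrm{Var}(h_{N_j})}}\right)^{p}\right] = (p-1)!!$ for $p = 2, 4, \dots$, and $= 0$ otherwise.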
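We must show that, as $N_j \to \infty$ and for all $p \ge 3$,

$\mathrm{Cum}\Big(\frac{1}{N_j} \sum_{k_1=1}^{N_j^2} \sum_{q=1}^{Q} w_q H_q(\hat\beta_{j,k_1}), \dots, \frac{1}{N_j} \sum_{k_p=1}^{N_j^2} \sum_{q=1}^{Q} w_q H_q(\hat\beta_{j,k_p})\Big)$

$\qquad = \sum_{q_1,\dots,q_p} w_{q_1} \cdots w_{q_p}\, \mathrm{Cum}\Big(\frac{1}{N_j} \sum_{k_1=1}^{N_j^2} H_{q_1}(\hat\beta_{j,k_1}), \dots, \frac{1}{N_j} \sum_{k_p=1}^{N_j^2} H_{q_p}(\hat\beta_{j,k_p})\Big)$

$\qquad = \frac{1}{N_j^{p}} \sum_{q_1,\dots,q_p} w_{q_1} \cdots w_{q_p} \sum_{G \in \Gamma_c(q_1,\dots,q_p)} \sum_{k_1,\dots,k_p=1}^{N_j^2} \prod_{1 \le u < v \le p} \gamma_{k_u k_v}^{\eta_{uv}(G)}$

$\qquad \le \frac{c}{N_j^{p}} \sup_{q_1,\dots,q_p} \sum_{G \in \Gamma_c(q_1,\dots,q_p)} \sum_{k_1,\dots,k_p=1}^{N_j^2} \prod_{1 \le u < v \le p} |\gamma_{k_u k_v}|^{\eta_{uv}(G)} \to 0$,

where $\gamma_{k_u k_v} := E[\hat\beta_{j,k_u}\hat\beta_{j,k_v}]$, $\Gamma_c(q_1,\dots,q_p)$ denotes the connected diagrams of order
$(q_1,\dots,q_p)$ and $\eta_{uv}(G)$ counts the number of edges between node $u$ and node $v$. It is
then clearly enough if we can prove that

$\sum_{k_1,\dots,k_p=1}^{N_j^2} \prod_{1 \le u < v \le p} |\gamma_{k_u k_v}|^{\eta_{uv}(G)} = O(N_j^{2[(p-1)/2]})$ for $p = 3, 5, 7, \dots$, and $= O(N_j^{p-2})$ for $p = 2, 4, 6, \dots$.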



Now write

$X_{q_1 \cdots q_p}(G) := \sum_{k_1,\dots,k_p=1}^{N_j^2} \prod_{1 \le u < v \le p} |\gamma_{k_u k_v}|^{\eta_{uv}(G)}$.

Note that each of the covariances is bounded by 1, so that $X_{q_1 \cdots q_p}(G)$ is a
nonincreasing function of $\eta_{uv}(G)$, $u,v = 1, \dots, p$. We modify iteratively the
elements $\eta_{uv}(G)$ by picking $(u,v)$ at random, and then decreasing $\eta_{uv}(G)$ by
1; in graphical terms, this can be viewed as taking a new graph $G_1$ where an
edge between $u$ and $v$ has been deleted ($G_1$ need no longer be connected).
We repeat this procedure until (in a finite number of steps, $T$, say), we
obtain a graph, $G_T$, such that the following circumstances are met:

(a) There are no isolated nodes. 

(b) There exists at least a path covering three nodes. 

(c) The connected components do not allow loops. 

It is simple to see that we can reach $G_T$ in a finite number of steps by
the following algorithm:

(1) We keep lowering $\eta_{uv}$ until we get to the point where the next step
would necessarily violate condition (a).

(2) If condition (b) is met, we stop our procedure.

(3) If condition (b) fails, it means we have only components with two
nodes and it is sufficient to raise by one any of the $\eta_{uv}$ (i.e., to introduce
an edge between two components).

It is clear that there are at most $[(p-1)/2]$ such components. For brevity we
assume that there are no paths with more than three nodes, the argument
in the remaining cases being entirely analogous. We partition the nodes
$u = 1, \dots, p$ into subsets $I_1$ and $I_2$ according to the following rule. All nodes
that belong to more than one edge belong to $I_1$; then for components with
only two nodes we put the one whose index is smaller again into $I_1$. All the
remaining others are put into $I_2$. It is simple to check that the cardinality
of $I_1$ equals the number of unconnected components in $G_T$ and hence is
at most $[(p-1)/2]$. Since $|\gamma_{k_u k_v}| \le 1$, we have



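$\sum_{k_1,\dots,k_p=1}^{N_j^2} \prod_{1 \le u < v \le p} |\gamma_{k_u k_v}|^{\eta_{uv}(G)} \le \sum_{k_1,\dots,k_p=1}^{N_j^2} \prod_{1 \le u < v \le p} |\gamma_{k_u k_v}|^{\eta_{uv}(G_T)}$.

Note that by construction, $k_u$ appears exactly once in the covariances whenever
$u \in I_2$; hence we obtain

$\sum_{k_1,\dots,k_p=1}^{N_j^2} \prod_{1 \le u < v \le p} |\gamma_{k_u k_v}|^{\eta_{uv}(G_T)} = \sum_{k_u,\, u \in I_1} \prod_{u \in I_2} \Big(\sum_{k_u} |\gamma_{k_u k_{v(u)}}|\Big)$,

where $v(u) \in I_1$ denotes the node to which $u \in I_2$ is connected. Thus we obtain, using (10) and the following Lemma 6,

$\sum_{k_u,\, u \in I_1} \prod_{u \in I_2} \Big(\sum_{k_u} |\gamma_{k_u k_{v(u)}}|\Big)
\le \sum_{k_u,\, u \in I_1} \prod_{u \in I_2} \Big(\sum_{k'} \frac{C_M}{(1 + B^j d(\xi_{j,k_{v(u)}},\xi_{j,k'}))^M}\Big)
\le \sum_{k_u,\, u \in I_1} c = O(N_j^{2[(p-1)/2]})$. □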



Lemma 6. If $M \ge 3$, there exists a constant $C'_M$ such that

$\sum_{k'=1}^{N_j^2} \frac{1}{(1 + B^j d(\xi_{j,k},\xi_{j,k'}))^M} \le C'_M$.
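
Proof. It is proved in Narcowich, Petrushev and Ward (2006) that to
get cubature points for polynomials of degree less than $L$, it is enough to
take a maximal $\varepsilon$-mesh on the sphere with $\varepsilon \sim 1/L$ [i.e., a set $\{x_1, \dots, x_K\}$ with
$d(x_i,x_j) > \varepsilon$ for $x_i \ne x_j$ and $K$ maximal]. Using a simple covering argument,
we have $\frac{4\pi}{|B(\varepsilon)|} \le K \le \frac{4\pi}{|B(\varepsilon/2)|}$, where $|B(\varepsilon)|$ is the volume of (any) ball of
radius $\varepsilon$, and $|B(\varepsilon)| \sim \varepsilon^2$, so $K \sim L^2$.

Let $L \sim B^j$ and consider the corresponding mesh defining $\mathcal{Z}_j$; as the balls $B(\xi_{j,k'}, 1/(2L))$ are disjoint
and $d(\xi_{j,k},x) \le 2d(\xi_{j,k},\xi_{j,k'})$ for $x \in B(\xi_{j,k'}, 1/(2L))$ by the triangular inequality, we obtain,
for a constant $C$ which may change from line to line,

$\sum_{k'=1}^{N_j^2} \frac{1}{(1 + B^j d(\xi_{j,k},\xi_{j,k'}))^M}
\le \frac{C}{|B(1/(2L))|} \sum_{k'=1}^{N_j^2} \int_{B(\xi_{j,k'},1/(2L))} \frac{dx}{(1 + L\,d(\xi_{j,k},\xi_{j,k'}))^M}$

$\qquad \le \frac{C}{|B(1/(2L))|} \sum_{k'=1}^{N_j^2} \int_{B(\xi_{j,k'},1/(2L))} \frac{dx}{(1 + \frac{L}{2} d(\xi_{j,k},x))^M}$

$\qquad \le C L^2 \int_{\mathbb{S}^2} \frac{dx}{(1 + \frac{L}{2}\arccos(\langle \xi_{j,k},x\rangle))^M} = 2\pi C L^2 \int_0^{\pi} \frac{\sin\theta\,d\theta}{(1 + \frac{L}{2}\theta)^M}$

$\qquad \le 2\pi C L^2 \int_0^{\infty} \frac{\theta\,d\theta}{(1 + \frac{L}{2}\theta)^M} = 8\pi C \Big(\frac{1}{M-2} - \frac{1}{M-1}\Big) \le C'_M$. □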

5. The functional central limit theorem. We are now ready to introduce
the following continuous-time vector process:

$W_J(r) := \frac{1}{\sqrt{J/2}} \sum_{j=2,4,\dots}^{[Jr]} \Omega_j^{-1/2}(h_{1,N_j}, \dots, h_{U,N_j})', \quad 0 \le r \le 1$,

where $\Omega_j$ was defined in (13). Here $U \ge 1$ is a fixed integer.

Theorem 7. Under Assumptions 1, 2 and 3, as $J \to \infty$,

$W_J \Rightarrow X$,

where $X$ denotes the $U$-dimensional standard Brownian motion and $\Rightarrow$
denotes weak convergence in the Skorohod space $D([0,1]^U)$.

Proof. We note first that the multivariate result follows from the case
$U = 1$, as remarked above. It is well known that in order to prove weak
convergence we have to establish convergence of the finite-dimensional
distributions and tightness. By the Cramér-Wold device, to establish the former
it is enough to focus on sequences of the form

$\frac{\sum_{j=2,4,\dots}^{[Jr]} \sum_{u=1}^{U} \lambda_u h_{u,N_j}}{\sqrt{\sum_{j=2,4,\dots}^{[Jr]} \sum_{u,v=1}^{U} \lambda_u \lambda_v E[h_{u,N_j} h_{v,N_j}]}}$.

However, it is clear that for any choice of real numbers $\lambda_1, \dots, \lambda_U$,

$\frac{1}{\sqrt{J/2}} \sum_{j=2,4,\dots}^{[Jr]} \Big\{\frac{1}{N_j} \sum_{k=1}^{N_j^2} \sum_{q=1}^{Q} \sum_{u=1}^{U} \lambda_u w_{uq} H_q(\hat\beta_{j,k})\Big\}
= \frac{1}{\sqrt{J/2}} \sum_{j=2,4,\dots}^{[Jr]} \Big\{\frac{1}{N_j} \sum_{k=1}^{N_j^2} \sum_{q=1}^{Q} w_q H_q(\hat\beta_{j,k})\Big\}$,

where, as before, $w_q := \sum_{u=1}^{U} \lambda_u w_{uq}$. On the other hand, a necessary and
sufficient condition for tightness of vector processes is tightness for the
component processes. Without any loss of generality, we can hence focus on the
univariate case $U = 1$. We first consider convergence of the finite-dimensional
distributions. It is straightforward to see that $h_{N_j}, h_{N_{j'}}$ are independent
whenever $|j - j'| \ge 2$. As the process $W_J(r)$ is a partial sum of independent
elements, convergence of the finite-dimensional distributions follows from
the Lyapunov condition


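(14) $\lim_{J \to \infty} \frac{\sum_{j=2,4,\dots}^{[Jr]} E[h_{N_j}^4]}{J^2} = 0$.

We have

$E[h_{N_j}^4] = E\Big[\Big(\frac{1}{N_j} \sum_{k} \sum_{q=1}^{Q} w_q H_q(\hat\beta_{j,k})\Big)^4\Big]
\le c \sup_{q_1,\dots,q_4} \sum_{G \in \Gamma(q_1,\dots,q_4)} \frac{1}{N_j^4} \sum_{k_1 k_2 k_3 k_4} \prod_{1 \le u < v \le 4} |\gamma_{k_u k_v}|^{\eta_{uv}(G)}$

$\qquad \le \frac{c}{N_j^4} \sum_{k_1 k_2 k_3 k_4} \big\{ |E[\hat\beta_{j,k_1}\hat\beta_{j,k_2}]\,E[\hat\beta_{j,k_3}\hat\beta_{j,k_4}]|
+ |E[\hat\beta_{j,k_1}\hat\beta_{j,k_3}]\,E[\hat\beta_{j,k_2}\hat\beta_{j,k_4}]|
+ |E[\hat\beta_{j,k_1}\hat\beta_{j,k_4}]\,E[\hat\beta_{j,k_2}\hat\beta_{j,k_3}]| \big\}$

$\qquad \le c \Big(\frac{1}{N_j^2} \sum_{k_1 k_2} |E[\hat\beta_{j,k_1}\hat\beta_{j,k_2}]|\Big)^2 \le C$

uniformly over $j$, in view of Lemma 6. Equation (14) then follows easily from

$\frac{\sum_{j=2,4,\dots}^{[Jr]} E[h_{N_j}^4]}{J^2} \le \frac{C[Jr]}{J^2} \to 0$.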



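Likewise, by a well-known result, tightness follows from

$E\big[|W_J(r_1) - W_J(r)|^2\, |W_J(r_2) - W_J(r)|^2\big]
\le \frac{C}{J^2}\, E\Big[\Big(\sum_{j=[Jr]+1}^{[Jr_2]} h_{N_j}\Big)^2\Big]\, E\Big[\Big(\sum_{j=[Jr_1]+1}^{[Jr]} h_{N_j}\Big)^2\Big]$

$\qquad \le \frac{C}{J^2} ([Jr_2] - [Jr])([Jr] - [Jr_1]) \le 4C(r_2 - r_1)^2$

for all $r_1 \le r \le r_2$, again in view of Lemma 6. □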



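6. Statistical applications. In this section, we use the previous results to
derive goodness-of-fit tests for spherical random fields. In particular, we take

$h_{1,N_j} = \frac{1}{N_j} \sum_{k=1}^{N_j^2} H_2(\hat\beta_{j,k}) = \frac{1}{N_j} \sum_{k=1}^{N_j^2} \{\hat\beta_{j,k}^2 - 1\}$,

$h_{2,N_j} = \frac{1}{N_j} \sum_{k=1}^{N_j^2} \{H_3(\hat\beta_{j,k}) + 3H_1(\hat\beta_{j,k})\} = \frac{1}{N_j} \sum_{k=1}^{N_j^2} \hat\beta_{j,k}^3$,

$h_{3,N_j} = \frac{1}{N_j} \sum_{k=1}^{N_j^2} \{H_4(\hat\beta_{j,k}) + 6H_2(\hat\beta_{j,k})\} = \frac{1}{N_j} \sum_{k=1}^{N_j^2} \{\hat\beta_{j,k}^4 - 3\}$.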

It is natural to view $h_{1,N_j}$ as a goodness-of-fit statistic on the angular power
spectrum $\{C_l\}$. More precisely, a typical question arising in applications is
to check the validity of a physical model (e.g., specific values of parameters)
by means of a comparison between the expected and observed angular power
spectrum. In this framework, this goal can be accomplished as follows: recall
that

$\hat\beta_{j,k} = \frac{\beta_{j,k}}{\sqrt{E\beta_{j,k}^2}}$,

where

$E\beta_{j,k}^2 = \lambda_{j,k} \sum_{l} b^2(l/B^j)\, C_l L_l(1)$.

Then it is clear that $h_{1,N_j}$ provides a measure of discrepancy between the
expected and observed values of an averaged power spectrum. In order to
construct feasible statistical procedures with an asymptotic justification to
investigate this and other hypotheses of interest, we define

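$W_J(r) = \frac{1}{\sqrt{J/2}} \sum_{j=2,4,\dots}^{[Jr]} \Omega_j^{-1/2}(h_{1,N_j}, h_{2,N_j}, h_{3,N_j})', \quad 0 \le r \le 1$,

where as before

$\Omega_j = \{E[h_{u,N_j} h_{v,N_j}]\}_{u,v=1,\dots,U}$

and

$E[h_{1,N_j}^2] = \mathrm{Var}(h_{1,N_j}) = \frac{1}{N_j^2} \mathrm{Var}\Big(\sum_{k} H_2(\hat\beta_{j,k})\Big) = \frac{2}{N_j^2} \sum_{k_1 k_2} (E[\hat\beta_{j,k_1}\hat\beta_{j,k_2}])^2$,

$E[h_{2,N_j}^2] = \mathrm{Var}(h_{2,N_j}) = \frac{1}{N_j^2} \mathrm{Var}\Big(\sum_{k} \{H_3(\hat\beta_{j,k}) + 3H_1(\hat\beta_{j,k})\}\Big)
= \frac{6}{N_j^2} \sum_{k_1 k_2} (E[\hat\beta_{j,k_1}\hat\beta_{j,k_2}])^3 + \frac{9}{N_j^2} \sum_{k_1 k_2} E[\hat\beta_{j,k_1}\hat\beta_{j,k_2}]$,

$E[h_{3,N_j}^2] = \mathrm{Var}(h_{3,N_j}) = \frac{1}{N_j^2} \mathrm{Var}\Big(\sum_{k} \{H_4(\hat\beta_{j,k}) + 6H_2(\hat\beta_{j,k})\}\Big)
= \frac{24}{N_j^2} \sum_{k_1 k_2} (E[\hat\beta_{j,k_1}\hat\beta_{j,k_2}])^4 + \frac{72}{N_j^2} \sum_{k_1 k_2} (E[\hat\beta_{j,k_1}\hat\beta_{j,k_2}])^2$.

Also

$E[h_{1,N_j} h_{2,N_j}] = \frac{1}{N_j^2} \sum_{k_1 k_2} E[\{H_3(\hat\beta_{j,k_1}) + 3H_1(\hat\beta_{j,k_1})\} H_2(\hat\beta_{j,k_2})] = 0$,

$E[h_{1,N_j} h_{3,N_j}] = \frac{1}{N_j^2} \sum_{k_1 k_2} E[\{H_4(\hat\beta_{j,k_1}) + 6H_2(\hat\beta_{j,k_1})\} H_2(\hat\beta_{j,k_2})] = \frac{12}{N_j^2} \sum_{k_1 k_2} (E[\hat\beta_{j,k_1}\hat\beta_{j,k_2}])^2$

and

$E[h_{2,N_j} h_{3,N_j}] = \frac{1}{N_j^2} \sum_{k_1 k_2} E[\{H_3(\hat\beta_{j,k_1}) + 3H_1(\hat\beta_{j,k_1})\}\{H_4(\hat\beta_{j,k_2}) + 6H_2(\hat\beta_{j,k_2})\}] = 0$.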
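
The Hermite expansions used for the skewness and kurtosis statistics, and the numerical constants in the variances, can be checked mechanically. The sketch below assumes the probabilists' Hermite polynomials and uses the fact that, for a standard Gaussian $X$, $\mathrm{Var}(\sum_{q \ge 1} c_q H_q(X)) = \sum_{q \ge 1} c_q^2\, q!$; the variable and function names are ours.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as herme

# Hermite-series coefficients (degree 0 first) of the two statistics
SKEW_COEFFS = [0, 3, 0, 1]        # H_3 + 3 H_1
KURT_COEFFS = [0, 0, 6, 0, 1]     # H_4 + 6 H_2

skew_monomials = herme.herme2poly(SKEW_COEFFS)   # monomial coefficients, degree 0 first
kurt_monomials = herme.herme2poly(KURT_COEFFS)


def variance_terms(coeffs):
    """Contributions c_q^2 * q! of each Hermite order q >= 1 to the variance."""
    return {q: c ** 2 * math.factorial(q) for q, c in enumerate(coeffs) if q >= 1 and c != 0}


if __name__ == "__main__":
    print("H_3 + 3 H_1 in monomials:", skew_monomials)
    print("H_4 + 6 H_2 in monomials:", kurt_monomials)
    print("variance terms (skewness):", variance_terms(SKEW_COEFFS))
    print("variance terms (kurtosis):", variance_terms(KURT_COEFFS))
```

The variance terms reproduce the coefficients 9 and 6, respectively 72 and 24, that multiply the powers of the correlations in $\mathrm{Var}(h_{2,N_j})$ and $\mathrm{Var}(h_{3,N_j})$.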

Since, by Theorem 7, $W_J$ converges in $D([0,1]^3)$ as $J \to \infty$ to a three-dimensional
standard Brownian motion $X$, by focusing on the first row of
$W_J$, we obtain for instance the Kolmogorov-Smirnov type test, that is,

$\lim_{J \to \infty} P\Big(\sup_{0 \le r \le 1} |W_{1J}(r)| > t\Big) = P\Big(\sup_{0 \le r \le 1} |X_1(r)| > t\Big)$,

$X_1$ denoting the first component of $X$. The derivation of threshold values for
$t$ is then standard. Similarly, it is possible to construct tests for Gaussianity
and isotropy based on the skewness and kurtosis statistics $h_{2,N_j}$ and $h_{3,N_j}$,
respectively. The numerical implementation of these procedures on CMB
data is currently underway.
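
For completeness, here is one way the threshold values could be computed. The sketch uses the classical series for the law of the maximum of the absolute value of a standard Brownian motion on $[0,1]$, $P(\sup_{0 \le r \le 1} |X_1(r)| \le t) = \frac{4}{\pi} \sum_{k \ge 0} \frac{(-1)^k}{2k+1} \exp\big(-\frac{(2k+1)^2 \pi^2}{8 t^2}\big)$, followed by bisection. The function names and the truncation and bracketing parameters are our own choices.

```python
import math


def prob_sup_abs_bm_exceeds(t, n_terms=50):
    """P(sup_{0<=r<=1} |X(r)| > t) for a standard Brownian motion X (truncated series)."""
    if t <= 0:
        return 1.0
    total = 0.0
    for k in range(n_terms):
        odd = 2 * k + 1
        total += (-1) ** k / odd * math.exp(-(odd ** 2) * math.pi ** 2 / (8.0 * t ** 2))
    return 1.0 - 4.0 / math.pi * total


def critical_value(level, lo=0.1, hi=10.0, tol=1e-10):
    """Threshold t with P(sup |X| > t) = level, found by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_sup_abs_bm_exceeds(mid) > level:
            lo = mid          # exceedance probability still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for level in (0.10, 0.05, 0.01):
        print(level, critical_value(level))
```

At the 5% level this gives a threshold close to 2.24, which agrees with the tail approximation $4(1 - \Phi(t))$ for the same probability.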

7. Missing observations. As mentioned in the Introduction, we expect 
needlets to be extremely robust in the presence of partially observed spheri- 
cal random fields, due to their excellent localization properties in real space. 
This result can be formalized as follows: we assume we observe $\tilde T(\xi) =
T(\xi) + V(\xi)$, where $V(\xi)$ is a noise field that need not be independent from
$T(\xi)$; indeed the most relevant case is $V(\xi) = -T(\xi)1_{\{\xi \in G\}}$, $G \subset \mathbb{S}^2$ denoting
the unobserved subset of the sphere. This situation arises when the field
is not observed (and hence its value is set to zero) for some locations in
the sky. This is the situation with CMB data in the so-called galactic cut
region, where CMB is dominated by the Milky Way emissions. We note
$\mathcal{N}_\varepsilon(\xi_{j,k}) := \{\xi \in \mathbb{S}^2 : d(\xi,\xi_{j,k}) \le \varepsilon\}$ the neighborhood of radius $\varepsilon$ around the
cubature point $\xi_{j,k}$, $d$ denoting as usual the angular distance. We write

$\tilde\beta_{jk} = \int_{\mathbb{S}^2} \tilde T(\xi)\psi_{jk}(\xi)\,d\xi$

for the wavelet coefficients of $\tilde T$. The following result highlights the
robustness property of needlets.

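Proposition 8. Let $\xi_{j,k}$ be a cubature point such that $V(\xi) = 0$ on
the neighborhood $\mathcal{N}_\varepsilon(\xi_{j,k})$ and assume that

(15) $\sup_{\xi} E[V(\xi)^2] =: V^* < \infty$.

Then, for every $M \in \mathbb{N}$,

$\|\tilde\beta_{jk} - \beta_{jk}\|_2 := \sqrt{E[(\tilde\beta_{jk} - \beta_{jk})^2]} \le \frac{c_M 4\pi \sqrt{V^*}\, B^j}{(1 + B^j \varepsilon)^M}$.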




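Proof. We have, by the localization property (7),

$|\tilde\beta_{jk} - \beta_{jk}| = \Big|\int_{\mathbb{S}^2} V(\xi)\psi_{jk}(\xi)\,d\xi\Big| \le \frac{c_M B^j}{(1 + B^j \varepsilon)^M} \int_{\mathbb{S}^2} |V(\xi)|\,d\xi$.

Therefore

$\|\tilde\beta_{jk} - \beta_{jk}\|_2 \le \frac{c_M B^j}{(1 + B^j \varepsilon)^M} \int_{\mathbb{S}^2} \sqrt{E[V(\xi)^2]}\,d\xi \le \frac{c_M 4\pi \sqrt{V^*}\, B^j}{(1 + B^j \varepsilon)^M}$.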

Remark 9. Remark that in Proposition 8 $V$ is not assumed to be
isotropic. Thus in the case of gaps (15) is obviously satisfied, as $E[V(\xi)^2] \le
E[T(\xi)^2]$ for every $\xi \in \mathbb{S}^2$. It is also interesting to stress that, in view of (11),



with $C'_M$ a constant depending on $c_1$, $c_M$ and $V^*$. For $M$ large enough, it is not difficult to show
that, up to different normalizing constants, the limit results in Sections 4, 5
and 6 are not affected asymptotically by the presence of sky cuts. Although 
this result must be taken with a good deal of common sense when working 
with finite-resolution experiments, we view this property as a very strong 
rationale to motivate the use of needlets in cosmology and astrophysics. 

8. Numerical implementation. In this section we address some practi- 
cal issues concerning the implementation of needlets on real data. In par- 
ticular we consider data on the Cosmic Microwave Background radiation, 
as provided by the NASA experiment WMAP. It is not difficult to devise 
some kernel construction that fulfills the conditions highlighted in Section 
2. As in Marinucci et al. (2008), we suggest the following algorithm [cf. 
Guilloux, Fay and Cardoso (2007) for alternative suggestions]. 

In order to construct the function $\phi$ of Section 2.1 one just defines $f(t) =
\exp\big(-\frac{1}{1-t^2}\big)$ for $-1 < t < 1$ and $f(t) = 0$ otherwise. $f$ is obviously $C^\infty$ and
compactly supported in the interval $[-1,1]$. We then construct the function

$\psi(u) = \frac{\int_{-1}^{u} f(t)\,dt}{\int_{-1}^{1} f(t)\,dt}$.

$\psi$ is $C^\infty$, nondecreasing and s.t. $\psi(-1) = 0$, $\psi(1) = 1$. Then the function
$\phi$ is obtained easily by joining 0 and 1 with $\psi$ suitably rescaled. Remark
that in practice one needs only to compute the function $b$ at the points
$l/B^j$. Therefore, once the maximal resolution is known, these values can be
computed and stored once for all. An instance of a needlet function is given
in Figure 2.

Fig. 2. Typical graph of the needlet and a corresponding spherical harmonic (dots) as
functions of the geodesic distance. Here $j = 5$ and $B = 2$. The localizing effect of the
Littlewood-Paley device is remarkable.
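
The construction just described can be sketched in a few lines. The code below is a minimal illustration under assumptions of ours: the integral defining $\psi$ is approximated by the trapezoidal rule on a fine grid, and the "suitable rescaling" is taken to be the affine map sending $1/B$ to $+1$ and $1$ to $-1$, so that $\phi = 1$ on $[0, 1/B]$ and $\phi = 0$ beyond 1. The names `make_phi` and `b_squared` are hypothetical.

```python
import numpy as np


def make_phi(B, n_grid=20001):
    """Tabulate psi on [-1, 1] and return phi: 1 on [0, 1/B], smooth descent to 0 at 1."""
    t = np.linspace(-1.0, 1.0, n_grid)
    f = np.zeros_like(t)
    inside = np.abs(t) < 1.0
    f[inside] = np.exp(-1.0 / (1.0 - t[inside] ** 2))      # f(t) = exp(-1 / (1 - t^2))
    # psi(u) = int_{-1}^{u} f / int_{-1}^{1} f, by the trapezoidal rule
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(t))))
    psi = cum / cum[-1]

    def phi(x):
        x = np.abs(np.asarray(x, dtype=float))
        s = 1.0 - 2.0 * B / (B - 1.0) * (x - 1.0 / B)      # assumed rescaling: 1/B -> +1, 1 -> -1
        return np.interp(np.clip(s, -1.0, 1.0), t, psi)

    return phi


def b_squared(phi, B, x):
    """Window b^2(x) = phi(x / B) - phi(x)."""
    x = np.asarray(x, dtype=float)
    return phi(x / B) - phi(x)


if __name__ == "__main__":
    B = 1.5
    phi = make_phi(B)
    l = np.arange(1, 201)
    # store b^2(l / B^j) once for all scales j up to the maximal resolution
    table = np.array([b_squared(phi, B, l / B ** j) for j in range(16)])
    print("max deviation from the partition of unity:", np.abs(table.sum(axis=0) - 1.0).max())
```

The table of values $b^2(l/B^j)$ is exactly what needs to be stored once the maximal resolution is fixed; summing it over $j$ recovers the partition of unity (4) for every $l \ge 1$ in the tabulated range.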

The random needlet coefficients are now evaluated as

(16) $\beta_{jk} = \sqrt{\lambda_{jk}} \sum_{l} b(l/B^j) \sum_{m=-l}^{l} a_{lm} Y_{lm}(\xi_{jk})$.

The practical implementation of (16) on a given random field requires the
evaluation of its spherical harmonic coefficients $(a_{lm})$. In principle, the latter
can be recovered by means of (9). In practice, in applications such as
CMB data analysis the random field is continuously observed by means of
antennae which average observations over tiny equal-area regions covering
the whole sky; the resulting values are projected on a discretized grid, where
the locations of points in the grid are chosen in order to make possible the
approximation of (9) by means of cubature formulae; a standard package for
this routine is HEALPix, described in Gorski et al. (2005). The final output of
this algorithm is indeed a triangular array of coefficients $(a_{lm})$, but one may
wonder whether numerical approximations may indeed spoil the validity of
the theoretical results presented in the previous sections.

To investigate this claim, we produce some numerical evidence on one of
the key properties of random needlet coefficients, that is, the uncorrelation
across different scales (Figure 3). We simulated 100 independent copies of
a random field, using the expansion (8). The coefficients $a_{lm}$ were sampled
as independent [but for the condition $(-1)^m a_{lm} = \overline{a_{l,-m}}$] complex Gaussian
r.v.'s with variance $C_l$. The results look encouraging: the actual correlation
for all $|j - j'| \ge 2$ is in the order of 0.1-1%, which is indeed consistent with
theoretical predictions, up to minor rounding errors.

Fig. 3. Decay of correlation on a CMB-like map; $j$ and $j'$ are on the two axes.
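
The sampling of the coefficients $a_{lm}$ described above can be sketched as follows. This is only an illustration of the constraint $(-1)^m a_{lm} = \overline{a_{l,-m}}$ and of the variance normalization, not the code used for the experiments; the function name `simulate_alm` and the storage convention (index $m + l$) are our own.

```python
import numpy as np


def simulate_alm(l, C_l, rng):
    """Draw a_{l,m}, m = -l..l, for a real Gaussian field with E|a_lm|^2 = C_l.

    The entry for order m is stored at index m + l.
    """
    a = np.zeros(2 * l + 1, dtype=complex)
    a[l] = rng.normal(scale=np.sqrt(C_l))                 # m = 0 is real
    for m in range(1, l + 1):
        re, im = rng.normal(scale=np.sqrt(C_l / 2.0), size=2)
        a[l + m] = re + 1j * im                           # real and imaginary parts share the variance
        a[l - m] = (-1) ** m * np.conj(a[l + m])          # a_{l,-m} = (-1)^m conj(a_{l,m})
    return a


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    draws = np.array([simulate_alm(20, 0.5, rng) for _ in range(2000)])
    print("empirical E|a_lm|^2:", np.mean(np.abs(draws) ** 2), "target:", 0.5)
```

With these coefficients, a realization of the field at multipole $l$ is obtained from (8), and the needlet coefficients from (16).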

We also performed some Monte Carlo experiments on the effect of missing 
observations on the values of the needlet coefficients. More precisely, for 
different types of sky gaps, we provide estimates of the quantity 

First we mimicked the experimental data on the CMB radiation, as described 
for instance by the WMAP team (see http://map.gsfc.nasa.gov/). In 
particular, data on CMB are contaminated mainly by the presence of the 
Milky Way (which is located around the equator, in the standard choice 
of coordinates) and several so-called point sources, amounting basically to 
known clusters of galaxies which produce a radiation unrelated to CMB.
To remove these emissions, the WMAP team has set to zero the value of the
field in a certain region, which is known as the Kp0 mask.

We simulated again 100 independent copies of a random field. The function
$C_l$ was chosen in order to mimic the best fit from satellite observations
of CMB [see Pietrobon, Balbi and Marinucci (2006) for details]. We fixed
$B = 1.5$ and $j = 11$, corresponding to a range of frequencies from $l = 58$ to
$l = 129$. We then estimated both the needlet coefficients $\tilde\beta_{jk}$ (in the presence
of missing observations) and $\beta_{jk}$ (for the completely observed field) and
evaluated the gap between the two using the discrepancy $D_{jk}$ of (17).
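
The frequency range quoted above follows from the support of the window: $b(l/B^j) \ne 0$ only for $B^{j-1} < l < B^{j+1}$. A small helper (the name `multipole_range` is ours) makes the arithmetic explicit.

```python
import math


def multipole_range(B, j):
    """Smallest and largest integer l with B**(j-1) < l < B**(j+1)."""
    lo = math.floor(B ** (j - 1)) + 1    # first integer strictly above B^(j-1)
    hi = math.ceil(B ** (j + 1)) - 1     # last integer strictly below B^(j+1)
    return lo, hi


if __name__ == "__main__":
    print(multipole_range(1.5, 11))   # the setting used in the experiment above
```

For $B = 1.5$ and $j = 11$ one has $B^{10} \approx 57.7$ and $B^{12} \approx 129.7$, which gives the range from 58 to 129.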

The results are displayed in Figure 4, where directions corresponding to
a value of $D_{jk} > 0.1$ are marked with a black dot. Note that, even for such
small values of $j$, the difference between $\tilde\beta_{jk}$ and $\beta_{jk}$ is rather small; indeed
$D_{jk}$ is above the threshold in approximately 20% of the cubature points. As
expected, these points cluster in the neighborhoods of the mask. Refer to
Guilloux, Fay and Cardoso (2007) for further numerical evidence.

Acknowledgment. We are grateful to D. Pietrobon for providing the nu- 
merical results in Section 8. 